58 research outputs found

Approximation theory of transformer networks for sequence modeling

Authors: Jiang Haotian, Li Qianxiao
Publication date: 29/05/2023

The transformer is a widely applied architecture in sequence modeling applications, but the theoretical understanding of its working principles is limited. In this work, we investigate the ability of transformers to approximate sequential relationships. We first prove a universal approximation theorem for the transformer hypothesis space. From its derivation, we identify a novel notion of regularity under which we can prove an explicit approximation rate estimate. This estimate reveals key structural properties of the transformer and suggests the types of sequence relationships that the transformer is adapted to approximating. In particular, it allows us to concretely discuss the structural bias between the transformer and classical sequence modeling methods, such as recurrent neural networks. Our findings are supported by numerical experiments.

arXiv.org e-Print Archive

On Matching, and Even Rectifying, Dynamical Systems through Koopman Operator Eigenfunctions

Authors: Bollt Erik M., Dietrich Felix, Kevrekidis Ioannis, Li Qianxiao
Publication date: 06/03/2018

Matching dynamical systems, through different forms of conjugacies and equivalences, has long been a fundamental concept, and a powerful tool, in the study and classification of nonlinear dynamic behavior (e.g. through normal forms). In this paper we will argue that the use of the Koopman operator and its spectrum is particularly well suited for this endeavor, both in theory and, especially, in view of recent data-driven algorithm developments. We believe, and document through illustrative examples, that this can nontrivially extend the use and applicability of the Koopman spectral theoretical and computational machinery beyond modeling and prediction, towards what can be considered as a systematic discovery of "Cole-Hopf-type" transformations for dynamics. Comment: 34 pages, 10 figures

arXiv.org e-Print Archive

ScholarBank@NUS